 representational power





The Limitations of Large Width in Neural Networks: A Deep Gaussian Process Perspective

Neural Information Processing Systems

Large width limits have been a recent focus of deep learning research: modulo computational practicalities, do wider networks outperform narrower ones? Answering this question has been challenging, as conventional networks gain representational power with width, potentially masking any negative effects.



meta learning-based frameworks adaptively balance auxiliary tasks (meta-path prediction) with the primary task (link

Neural Information Processing Systems

We thank all four reviewers for their unanimous support for the paper and their constructive comments. Overall, reviewers are positive about our contributions: [R5] "The proposed framework is very general and It is the 'first' paper to do so.", "The overall quality of this paper is good." Is the proposed method applicable to existing heterogeneous GNNs such as GTNs [1]? Yes, as [R4] pointed out, our framework can be applied to any GNN in a plug-in manner.


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

NIPS Neural Information Processing Systems, 8-11th December 2014, Montreal, Canada. Paper ID: 1380. Title: "Do Deep Nets Really Need to be Deep?" First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. The authors show empirical results on TIMIT and CIFAR-10 that shallow nets (SNNs) trained to mimic the outputs of DNNs and CNNs achieve comparable accuracy on these tasks. The paper is clearly written and makes compelling arguments. The contribution is significant because it suggests that SNNs are capable of learning complex functions that were thought to be learnable only with DNNs or CNNs. This means that better training algorithms have yet to be devised for SNNs.
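
The idea of training a shallow net to mimic a deeper model's outputs can be illustrated with a minimal sketch. It assumes the mimic objective is a mean squared error between student and teacher outputs; the review above does not state the exact loss, and the name `mimic_loss` is hypothetical.

```python
def mimic_loss(student_outputs, teacher_outputs):
    """Mean squared error between a student's and a teacher's outputs.

    Assumption: both arguments are equal-length lists of floats, e.g. the
    outputs of a shallow net and of a deep net on the same input.
    """
    n = len(student_outputs)
    return sum((s - t) ** 2 for s, t in zip(student_outputs, teacher_outputs)) / n


# The student matches the teacher on the first output and is off by 2 on the second.
loss = mimic_loss([1.0, 2.0], [1.0, 4.0])
```

Minimizing such a loss over many inputs pushes the shallow model toward the function computed by the deep one, rather than toward the original hard labels.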


Stochastic Normalizing Flows

Neural Information Processing Systems

By invoking ideas from non-equilibrium statistical mechanics we derive an efficient training procedure by which both the sampler's and the flow's parameters can be optimized end-to-end, and by which we can compute


Time to Spike? Understanding the Representational Power of Spiking Neural Networks in Discrete Time

arXiv.org Artificial Intelligence

Recent years have seen significant progress in developing spiking neural networks (SNNs) as a potential solution to the energy challenges posed by conventional artificial neural networks (ANNs). However, our theoretical understanding of SNNs remains relatively limited compared to the ever-growing body of literature on ANNs. In this paper, we study a discrete-time model of SNNs based on leaky integrate-and-fire (LIF) neurons, referred to as discrete-time LIF-SNNs, a widely used framework that still lacks solid theoretical foundations. We demonstrate that discrete-time LIF-SNNs with static inputs and outputs realize piecewise constant functions defined on polyhedral regions, and more importantly, we quantify the network size required to approximate continuous functions. Moreover, we investigate the impact of latency (number of time steps) and depth (number of layers) on the complexity of the input space partitioning induced by discrete-time LIF-SNNs. Our analysis highlights the importance of latency and contrasts these networks with ANNs employing piecewise linear activation functions. Finally, we present numerical experiments to support our theoretical findings.
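
To make the piecewise constant claim concrete, here is a minimal sketch of a single discrete-time LIF neuron driven by a static input. The update rule (leak factor `beta`, threshold `theta`, reset by subtraction, spike-count output) is one common formulation and is an assumption here; the paper's exact model may differ, and the function name is hypothetical.

```python
def lif_spike_count(x, weights, bias, T=4, beta=0.9, theta=1.0):
    """Count the spikes of one discrete-time LIF neuron over T time steps.

    Assumed dynamics: u[t] = beta * u[t-1] + (w . x + b); the neuron spikes
    when u[t] >= theta and is then reset by subtracting theta.
    """
    current = sum(w * xi for w, xi in zip(weights, x)) + bias  # static input current
    u = 0.0       # membrane potential
    spikes = 0
    for _ in range(T):
        u = beta * u + current
        if u >= theta:
            spikes += 1
            u -= theta  # reset by subtraction
    return spikes


# Nearby inputs fall in the same region and give the same integer output.
a = lif_spike_count([0.5], [1.0], 0.0)
b = lif_spike_count([0.51], [1.0], 0.0)
```

Because the output is an integer spike count, small changes in the input leave it unchanged until a threshold crossing shifts, which is why the realized function is constant on regions of the input space. Increasing `T` allows more distinct output values, in line with the abstract's emphasis on latency.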


Interpretable non-linear dimensionality reduction using gaussian weighted linear transformation

arXiv.org Artificial Intelligence

Dimensionality reduction techniques are fundamental for analyzing and visualizing high-dimensional data, with established methods like t-SNE and PCA presenting a trade-off between representational power and interpretability. This paper introduces a novel approach that bridges this gap by combining the interpretability of linear methods with the expressiveness of non-linear transformations. The proposed algorithm constructs a non-linear mapping between high-dimensional and low-dimensional spaces through a combination of linear transformations, each weighted by Gaussian functions. This architecture enables complex non-linear transformations while preserving the interpretability advantages of linear methods, as each transformation can be analyzed independently. The resulting model provides both powerful dimensionality reduction and transparent insights into the transformed space. Techniques for interpreting the learned transformations are presented, including methods for identifying suppressed dimensions and how space is expanded and contracted. These tools enable practitioners to understand how the algorithm preserves and modifies geometric relationships during dimensionality reduction. To ensure the practical utility of this algorithm, the creation of user-friendly software packages is emphasized, facilitating its adoption in both academia and industry.
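
A combination of Gaussian-weighted linear transformations can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: each component is taken to be a matrix `A`, a Gaussian center, and a width `sigma`, the weights are left unnormalized, and all names are hypothetical.

```python
import math


def gaussian_weight(x, center, sigma):
    """Isotropic Gaussian weight of point x relative to a center (assumed form)."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def transform(x, components):
    """Map x to a low-dimensional point as a Gaussian-weighted sum of linear maps.

    components: list of (A, center, sigma) tuples; every A has the same
    number of rows, which is the output dimension.
    """
    out_dim = len(components[0][0])
    y = [0.0] * out_dim
    for A, center, sigma in components:
        w = gaussian_weight(x, center, sigma)
        y = [yi + w * v for yi, v in zip(y, matvec(A, x))]
    return y


# One component that keeps the first two coordinates, centered on the input itself.
keep_first_two = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
point = [1.0, 2.0, 3.0]
low_dim = transform(point, [(keep_first_two, point, 1.0)])
```

Each matrix `A` can be inspected on its own, for example to see which input dimensions it suppresses, while the Gaussian weights determine where in the input space that linear map dominates.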